Wczytanie bibliotek

suppressWarnings(library(fansi))
suppressWarnings(library(dplyr))
## 
## DoƂączanie pakietu: 'dplyr'
## Następujące obiekty zostaƂy zakryte z 'package:stats':
## 
##     filter, lag
## Następujące obiekty zostaƂy zakryte z 'package:base':
## 
##     intersect, setdiff, setequal, union
suppressWarnings(library(plyr))
## ------------------------------------------------------------------------------
## You have loaded plyr after dplyr - this is likely to cause problems.
## If you need functions from both plyr and dplyr, please load plyr first, then dplyr:
## library(plyr); library(dplyr)
## ------------------------------------------------------------------------------
## 
## DoƂączanie pakietu: 'plyr'
## Następujące obiekty zostaƂy zakryte z 'package:dplyr':
## 
##     arrange, count, desc, failwith, id, mutate, rename, summarise,
##     summarize
suppressWarnings(library(vegan))
## Ɓadowanie wymaganego pakietu: permute
## Ɓadowanie wymaganego pakietu: lattice
## This is vegan 2.6-4
suppressWarnings(library(tidyverse))
## ── Attaching core tidyverse packages ──────────────────────── tidyverse 2.0.0 ──
## ✔ forcats   1.0.0     ✔ readr     2.1.4
## ✔ ggplot2   3.4.2     ✔ stringr   1.5.0
## ✔ lubridate 1.9.2     ✔ tibble    3.2.1
## ✔ purrr     1.0.1     ✔ tidyr     1.3.0
## ── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
## ✖ plyr::arrange()   masks dplyr::arrange()
## ✖ purrr::compact()  masks plyr::compact()
## ✖ plyr::count()     masks dplyr::count()
## ✖ plyr::desc()      masks dplyr::desc()
## ✖ plyr::failwith()  masks dplyr::failwith()
## ✖ dplyr::filter()   masks stats::filter()
## ✖ plyr::id()        masks dplyr::id()
## ✖ dplyr::lag()      masks stats::lag()
## ✖ plyr::mutate()    masks dplyr::mutate()
## ✖ plyr::rename()    masks dplyr::rename()
## ✖ plyr::summarise() masks dplyr::summarise()
## ✖ plyr::summarize() masks dplyr::summarize()
## â„č Use the conflicted package (<http://conflicted.r-lib.org/>) to force all conflicts to become errors
suppressWarnings(library(ggplot2))
suppressWarnings(library(phyloseq))
suppressWarnings(library(qiime2R))
suppressWarnings(library(picante))
## Ɓadowanie wymaganego pakietu: ape
## 
## DoƂączanie pakietu: 'ape'
## 
## Następujący obiekt zostaƂ zakryty z 'package:dplyr':
## 
##     where
## 
## Ɓadowanie wymaganego pakietu: nlme
## 
## DoƂączanie pakietu: 'nlme'
## 
## Następujący obiekt zostaƂ zakryty z 'package:dplyr':
## 
##     collapse

Wczytanie danych

Wczytuję dane i tworzę obiekt Phyloseq

phyloseq = qza_to_phyloseq(features = "../table_output.qza", tree = "../rooted_tree.qza",taxonomy = "../taxonomy.qza", metadata = "../sample_metadata.tsv")

Podstawowe informacje o danych

Sprawdzam liczbę taksonów i próbek, maksymalną i minimalną liczbę odczytów w próbce, poziomy taksonomiczne

ntaxa(phyloseq)
## [1] 1731
nsamples(phyloseq)
## [1] 24
max(sample_sums(phyloseq))
## [1] 24641
min(sample_sums(phyloseq))
## [1] 10197
rank_names(phyloseq)
## [1] "Kingdom" "Phylum"  "Class"   "Order"   "Family"  "Genus"   "Species"

Filtrowanie danych - przedstawiciele Euglenida

Wybieram tylko te odczyty, ktĂłrych przedstawiciele naleĆŒÄ… do Euglenida

phyloseq_euglenida <- subset_taxa(phyloseq, Order == "Euglenida")

Sprawdzam liczbę taksonów i próbek, maksymalną i minimalną liczbę odczytów w próbce, poziomy taksonomiczne w filtrowanych danych

ntaxa(phyloseq_euglenida)
## [1] 1609
nsamples(phyloseq_euglenida)
## [1] 24
max(sample_sums(phyloseq_euglenida))
## [1] 21827
min(sample_sums(phyloseq_euglenida))
## [1] 10188
rank_names(phyloseq_euglenida)
## [1] "Kingdom" "Phylum"  "Class"   "Order"   "Family"  "Genus"   "Species"

W porĂłwnaniu do kompletnych danych, przefiltrowane dane są mniejsze - zawierają mniej taksonĂłw, ale takĆŒe mają mniejszą maksymalną i minimalną liczbę odczytĂłw w prĂłbce. Poziomy taksonomiczne pozostaƂy bez zmian. Do dalszych analiz będę zatem uĆŒywaƂa odczytĂłw, ktĂłre naleĆŒÄ… do Euglenida.

Ɓączenie ASV z tych samych gatunków

Sprawdzam przyporządkowanie taksonomiczne

tax_table(phyloseq_euglenida)[1:20, 1:7]
## Taxonomy Table:     [20 taxa by 7 taxonomic ranks]:
##                                  Kingdom     Phylum     Class     Order      
## 3e08234b0578723bf2199340c3d94efe "Eukaryota" "Excavata" "Discoba" "Euglenida"
## df69ce7f69ac5a14488fec0a045f5125 "Eukaryota" "Excavata" "Discoba" "Euglenida"
## 2c265920e31f9203159d8daf3aba1f5a "Eukaryota" "Excavata" "Discoba" "Euglenida"
## 7044544281b955ff0ceaee16373f3111 "Eukaryota" "Excavata" "Discoba" "Euglenida"
## a8d367ff3e5df6c0b9c9f0e16bb9a46d "Eukaryota" "Excavata" "Discoba" "Euglenida"
## 564cb6a44928a0e1035b9bd23d456b77 "Eukaryota" "Excavata" "Discoba" "Euglenida"
## 354d1f84c044a937703a77d32e6aebc1 "Eukaryota" "Excavata" "Discoba" "Euglenida"
## 79a18eb5f8af707d72fd8e89d4b66edd "Eukaryota" "Excavata" "Discoba" "Euglenida"
## c3edc4aabd39575e3530b4f58b1ff7f3 "Eukaryota" "Excavata" "Discoba" "Euglenida"
## 781a37fde2949eb9c7339c5bc6926ac5 "Eukaryota" "Excavata" "Discoba" "Euglenida"
## 5f27f7c1072be243a443e28f0021bc45 "Eukaryota" "Excavata" "Discoba" "Euglenida"
## 50a53c1814078e61b3e325dbfd4b8871 "Eukaryota" "Excavata" "Discoba" "Euglenida"
## 16c06ae6836681f8747edd534ea2c7e2 "Eukaryota" "Excavata" "Discoba" "Euglenida"
## b95c684c45803f0bf619a12e07e3286a "Eukaryota" "Excavata" "Discoba" "Euglenida"
## a1beb1b3de23e201fcbb4066a9f03ac1 "Eukaryota" "Excavata" "Discoba" "Euglenida"
## e9d88e75a95b3e9ce6199c48d65af934 "Eukaryota" "Excavata" "Discoba" "Euglenida"
## 938cbddf28f0773988f58a9327ac9167 "Eukaryota" "Excavata" "Discoba" "Euglenida"
## de3101e77798ba61df4cc177173529fd "Eukaryota" "Excavata" "Discoba" "Euglenida"
## 4d4480be2894f09f18c03371a81307fa "Eukaryota" "Excavata" "Discoba" "Euglenida"
## b05bb27ac9ddea225db70bfd60676f35 "Eukaryota" "Excavata" "Discoba" "Euglenida"
##                                  Family        Genus         
## 3e08234b0578723bf2199340c3d94efe "Euglenaceae" "Strombomonas"
## df69ce7f69ac5a14488fec0a045f5125 "Euglenaceae" "Strombomonas"
## 2c265920e31f9203159d8daf3aba1f5a "Euglenaceae" "Strombomonas"
## 7044544281b955ff0ceaee16373f3111 "Euglenaceae" "Strombomonas"
## a8d367ff3e5df6c0b9c9f0e16bb9a46d "Euglenaceae" "Strombomonas"
## 564cb6a44928a0e1035b9bd23d456b77 "Euglenaceae" "Strombomonas"
## 354d1f84c044a937703a77d32e6aebc1 "Euglenaceae" "Strombomonas"
## 79a18eb5f8af707d72fd8e89d4b66edd "Euglenaceae" "Strombomonas"
## c3edc4aabd39575e3530b4f58b1ff7f3 "Euglenaceae" NA            
## 781a37fde2949eb9c7339c5bc6926ac5 "Euglenaceae" "Strombomonas"
## 5f27f7c1072be243a443e28f0021bc45 "Euglenaceae" "Strombomonas"
## 50a53c1814078e61b3e325dbfd4b8871 "Euglenaceae" "Strombomonas"
## 16c06ae6836681f8747edd534ea2c7e2 "Euglenaceae" "Strombomonas"
## b95c684c45803f0bf619a12e07e3286a "Euglenaceae" "Strombomonas"
## a1beb1b3de23e201fcbb4066a9f03ac1 "Euglenaceae" "Strombomonas"
## e9d88e75a95b3e9ce6199c48d65af934 "Euglenaceae" "Strombomonas"
## 938cbddf28f0773988f58a9327ac9167 "Euglenaceae" "Strombomonas"
## de3101e77798ba61df4cc177173529fd "Euglenaceae" "Strombomonas"
## 4d4480be2894f09f18c03371a81307fa "Euglenaceae" "Strombomonas"
## b05bb27ac9ddea225db70bfd60676f35 "Euglenaceae" "Strombomonas"
##                                  Species                            
## 3e08234b0578723bf2199340c3d94efe "D_10__Strombomonas fluviatilis"   
## df69ce7f69ac5a14488fec0a045f5125 "D_10__Strombomonas maxima"        
## 2c265920e31f9203159d8daf3aba1f5a NA                                 
## 7044544281b955ff0ceaee16373f3111 "D_10__Strombomonas maxima"        
## a8d367ff3e5df6c0b9c9f0e16bb9a46d "D_10__Strombomonas maxima"        
## 564cb6a44928a0e1035b9bd23d456b77 "D_10__Strombomonas fluviatilis"   
## 354d1f84c044a937703a77d32e6aebc1 "D_10__Strombomonas acuminata"     
## 79a18eb5f8af707d72fd8e89d4b66edd "D_10__Strombomonas acuminata"     
## c3edc4aabd39575e3530b4f58b1ff7f3 NA                                 
## 781a37fde2949eb9c7339c5bc6926ac5 "D_10__Strombomonas verrucosa"     
## 5f27f7c1072be243a443e28f0021bc45 "D_10__Strombomonas verrucosa"     
## 50a53c1814078e61b3e325dbfd4b8871 "D_10__Strombomonas verrucosa"     
## 16c06ae6836681f8747edd534ea2c7e2 "D_10__Strombomonas triquetra"     
## b95c684c45803f0bf619a12e07e3286a "D_10__Strombomonas triquetra"     
## a1beb1b3de23e201fcbb4066a9f03ac1 "D_10__Strombomonas triquetra"     
## e9d88e75a95b3e9ce6199c48d65af934 NA                                 
## 938cbddf28f0773988f58a9327ac9167 "D_10__Strombomonas triquetra"     
## de3101e77798ba61df4cc177173529fd "D_10__Strombomonas triquetra"     
## 4d4480be2894f09f18c03371a81307fa "D_10__Strombomonas schauinslandii"
## b05bb27ac9ddea225db70bfd60676f35 "D_10__Strombomonas schauinslandii"
tax_table(phyloseq_euglenida)[1589:1609, 1:7]
## Taxonomy Table:     [21 taxa by 7 taxonomic ranks]:
##                                  Kingdom     Phylum     Class     Order      
## 46902a3b58f2fdd47db07ab97f87d752 "Eukaryota" "Excavata" "Discoba" "Euglenida"
## ae8a4d797f8e3dece67e61516d5bdb83 "Eukaryota" "Excavata" "Discoba" "Euglenida"
## 8c5562f52d4d49fb784a350a078c2ba5 "Eukaryota" "Excavata" "Discoba" "Euglenida"
## e447cd0b64cfe2f052076dfd61e799e3 "Eukaryota" "Excavata" "Discoba" "Euglenida"
## 99df6b37ae090998484acdaa2823c721 "Eukaryota" "Excavata" "Discoba" "Euglenida"
## 59d2205677ae4d2d55381ad9df617251 "Eukaryota" "Excavata" "Discoba" "Euglenida"
## fe4996ede0cc114227eb867268311a0f "Eukaryota" "Excavata" "Discoba" "Euglenida"
## 42d685f225c79f29d512f2deb97ac923 "Eukaryota" "Excavata" "Discoba" "Euglenida"
## 7735d5a32c709e70a181180879d730ed "Eukaryota" "Excavata" "Discoba" "Euglenida"
## 3452f4cfc4c64b6c82eb8b8e03eb5114 "Eukaryota" "Excavata" "Discoba" "Euglenida"
## eb92b058515aea65e5f2e63b37ef9375 "Eukaryota" "Excavata" "Discoba" "Euglenida"
## 24cc370ae8a4e2b00c8b71d252ea3e0c "Eukaryota" "Excavata" "Discoba" "Euglenida"
## ec9b1ecc0fccb1aef783adf749e662f8 "Eukaryota" "Excavata" "Discoba" "Euglenida"
## af51c3ebc8761d42e4b3559b08c15e6e "Eukaryota" "Excavata" "Discoba" "Euglenida"
## 1127995efe19b7116dddedab9c5052f3 "Eukaryota" "Excavata" "Discoba" "Euglenida"
## 0e2e0b9f99f38c5f7d756a1b77a28195 "Eukaryota" "Excavata" "Discoba" "Euglenida"
## 546c61b857db3572bc1d3cea74893d5e "Eukaryota" "Excavata" "Discoba" "Euglenida"
## d9009c6ae559bc7155354ac2fada2d16 "Eukaryota" "Excavata" "Discoba" "Euglenida"
## 924075902e5c8d005c906834b5210036 "Eukaryota" "Excavata" "Discoba" "Euglenida"
## c75a6e9b08870722ddca1f76a3989557 "Eukaryota" "Excavata" "Discoba" "Euglenida"
## 849d76f6172e474b597d1ba2a65b9807 "Eukaryota" "Excavata" "Discoba" "Euglenida"
##                                  Family        Genus          
## 46902a3b58f2fdd47db07ab97f87d752 "Euglenaceae" "Trachelomonas"
## ae8a4d797f8e3dece67e61516d5bdb83 "Euglenaceae" "Trachelomonas"
## 8c5562f52d4d49fb784a350a078c2ba5 "Euglenaceae" "Trachelomonas"
## e447cd0b64cfe2f052076dfd61e799e3 "Euglenaceae" "Trachelomonas"
## 99df6b37ae090998484acdaa2823c721 "Euglenaceae" "Trachelomonas"
## 59d2205677ae4d2d55381ad9df617251 "Euglenaceae" "Trachelomonas"
## fe4996ede0cc114227eb867268311a0f "Euglenaceae" "Trachelomonas"
## 42d685f225c79f29d512f2deb97ac923 "Euglenaceae" "Trachelomonas"
## 7735d5a32c709e70a181180879d730ed "Euglenaceae" "Trachelomonas"
## 3452f4cfc4c64b6c82eb8b8e03eb5114 "Euglenaceae" "Trachelomonas"
## eb92b058515aea65e5f2e63b37ef9375 "Euglenaceae" "Trachelomonas"
## 24cc370ae8a4e2b00c8b71d252ea3e0c "Euglenaceae" "Trachelomonas"
## ec9b1ecc0fccb1aef783adf749e662f8 "Euglenaceae" "Trachelomonas"
## af51c3ebc8761d42e4b3559b08c15e6e "Euglenaceae" "Trachelomonas"
## 1127995efe19b7116dddedab9c5052f3 "Euglenaceae" "Trachelomonas"
## 0e2e0b9f99f38c5f7d756a1b77a28195 "Euglenaceae" "Trachelomonas"
## 546c61b857db3572bc1d3cea74893d5e "Euglenaceae" "Trachelomonas"
## d9009c6ae559bc7155354ac2fada2d16 "Euglenaceae" "Trachelomonas"
## 924075902e5c8d005c906834b5210036 "Euglenaceae" "Trachelomonas"
## c75a6e9b08870722ddca1f76a3989557 "Euglenaceae" "Trachelomonas"
## 849d76f6172e474b597d1ba2a65b9807 "Euglenaceae" "Trachelomonas"
##                                  Species                          
## 46902a3b58f2fdd47db07ab97f87d752 "D_10__Trachelomonas compacta"   
## ae8a4d797f8e3dece67e61516d5bdb83 "D_10__Trachelomonas compacta"   
## 8c5562f52d4d49fb784a350a078c2ba5 "D_10__Trachelomonas compacta"   
## e447cd0b64cfe2f052076dfd61e799e3 "D_10__Trachelomonas compacta"   
## 99df6b37ae090998484acdaa2823c721 "D_10__Trachelomonas compacta"   
## 59d2205677ae4d2d55381ad9df617251 "D_10__Trachelomonas compacta"   
## fe4996ede0cc114227eb867268311a0f "D_10__Trachelomonas compacta"   
## 42d685f225c79f29d512f2deb97ac923 "D_10__Trachelomonas compacta"   
## 7735d5a32c709e70a181180879d730ed "D_10__Trachelomonas compacta"   
## 3452f4cfc4c64b6c82eb8b8e03eb5114 "D_10__Trachelomonas manschurica"
## eb92b058515aea65e5f2e63b37ef9375 "D_10__Trachelomonas compacta"   
## 24cc370ae8a4e2b00c8b71d252ea3e0c "D_10__Trachelomonas compacta"   
## ec9b1ecc0fccb1aef783adf749e662f8 "D_10__Trachelomonas compacta"   
## af51c3ebc8761d42e4b3559b08c15e6e "D_10__Trachelomonas compacta"   
## 1127995efe19b7116dddedab9c5052f3 "D_10__Trachelomonas compacta"   
## 0e2e0b9f99f38c5f7d756a1b77a28195 "D_10__Trachelomonas compacta"   
## 546c61b857db3572bc1d3cea74893d5e "D_10__Trachelomonas compacta"   
## d9009c6ae559bc7155354ac2fada2d16 "D_10__Trachelomonas compacta"   
## 924075902e5c8d005c906834b5210036 "D_10__Trachelomonas compacta"   
## c75a6e9b08870722ddca1f76a3989557 "D_10__Trachelomonas compacta"   
## 849d76f6172e474b597d1ba2a65b9807 "D_10__Trachelomonas compacta"

Sprawdzam ile jest unikatowych przyporządkowaƄ taksonomicznych na poziomie gatunku

unique_euglenida = get_taxa_unique(phyloseq_euglenida, taxonomic.rank=rank_names(phyloseq_euglenida)[7])
ntaxa(phyloseq_euglenida)
## [1] 1609
length(unique_euglenida)
## [1] 131

PoniewaĆŒ niewiele przyporządkowaƄ na poziomie gatunku jest nieznanych (maƂo wartoƛci NaN) i jest tylko 131 unikatowych gatunkĂłw, wydaje się, ĆŒe poƂączenie ASV z tych samych gatunkĂłw będzie dobrą decyzją. Usprawni to obliczenia i pozwoli na bardziej czytelną wizualizację wynikĂłw, przy niewielkiej stracie danych. Dalsze analizy przeprowadzę zatem na ASV poƂączonych dla tych samych gatunkĂłw.

phyloseq_glom <- tax_glom(phyloseq_euglenida, "Species")

Sprawdzam liczbę taksonów i próbek, maksymalną i minimalną liczbę odczytów w próbce, poziomy taksonomiczne w poƂączonych danych

ntaxa(phyloseq_glom)
## [1] 130
nsamples(phyloseq_glom)
## [1] 24
max(sample_sums(phyloseq_glom))
## [1] 21208
min(sample_sums(phyloseq_glom))
## [1] 7268
rank_names(phyloseq_glom)
## [1] "Kingdom" "Phylum"  "Class"   "Order"   "Family"  "Genus"   "Species"

PoƂączenie ASV z tych samych gatunkĂłw zgodnie z oczekiwaniami zmiejszyƂo liczbę taksonĂłw, ale w większoƛci są to taksony, ktĂłre się powtarzaƂy. Nieznacznie zmieniƂy się rĂłwnieĆŒ minimalna i maksymalna liczba odczytĂłw w prĂłbce, ale rĂłwnieĆŒ jest to spodziewany wynik, poniewaĆŒ zmieniƂy się dane - poƂączono sekwencje z tych samych gatunkĂłw i odrzucono wartoƛci NaN. Nie wydaje się jednak, ĆŒeby byƂa to znacząca rĂłĆŒnica w porĂłwnaniu z danymi niepoƂączonymi. Nie zmieniƂa się liczba prĂłbek, ani poziomy taksonimiczne.

Krzywa wysycenia

Tworzę krzywą alfa-curve

tab = data.frame(t(otu_table(phyloseq_glom)))
#png(file="../images/alfa-curve.png", width=800, height=600, res=100)
rarecurve(tab, step=50, cex=0.9, label=FALSE, col=rainbow(24), lwd = 3)
## Warning in rarecurve(tab, step = 50, cex = 0.9, label = FALSE, col =
## rainbow(24), : most observed count data have counts 1, but smallest count is 2

#dev.off()

Na podstawie krzywej moĆŒna stwierdzć, ĆŒe liczba odczytĂłw w najmniej licznej prĂłbce wydaje się wystarczająca, ĆŒeby moĆŒna byƂo do tej wartoƛci ograniczyć liczbę odczytĂłw we wszystkich prĂłbkach bez znaczącej straty informacji o bogactwie gatunkowym i rĂłĆŒnorodnoƛci biologicznej. Najmniej liczna prĂłbka zawiera 7268 odczytĂłw. Jest to mniej więcej 1/3 liczebnoƛci odczytĂłw w najbardziej licznej prĂłbce, ale w porĂłwnaniu z resztą prĂłbek ta wartoƛć nie odbiega znacząco. Najprawdopodobniej pewne dane zostaną utracone, ale warto ograniczyć liczbę odczytĂłw, poniewaĆŒ usprawni to obliczenia, a sama strata będzie niekrytyczna. Wszystkie prĂłbki ograniczę zatem do 7268 odczytĂłw.

Rarefaction

Ograniczam liczbę odczytów we wszystkich próbkach

phyloseq_rare <- rarefy_even_depth(phyloseq_glom, sample.size=min(sample_sums((phyloseq_glom))), replace=F)
## You set `rngseed` to FALSE. Make sure you've set & recorded
##  the random seed of your session for reproducibility.
## See `?set.seed`
## ...
## 1OTUs were removed because they are no longer 
## present in any sample after random subsampling
## ...

Sprawdzam liczbę taksonów i próbek, maksymalną i minimalną liczbę odczytów w próbce, poziomy taksonomiczne w ograniczonych danych

ntaxa(phyloseq_rare)
## [1] 129
nsamples(phyloseq_rare)
## [1] 24
max(sample_sums(phyloseq_rare))
## [1] 7268
min(sample_sums(phyloseq_rare))
## [1] 7268
rank_names(phyloseq_rare)
## [1] "Kingdom" "Phylum"  "Class"   "Order"   "Family"  "Genus"   "Species"

Po ograniczeniu danych nie utracono ĆŒadnych taksonĂłw, ani ĆŒadnej prĂłbki. Wydaje się, ĆŒe ograniczenie odczytĂłw w prĂłbkach nie zaburzyƂo znacząco danych i ich rĂłĆŒnorodnoƛci.

Analiza alfa-rĂłĆŒnorodnoƛci

c25 <- c(
  "dodgerblue2", "#E31A1C", # red
  "green4",
  "#6A3D9A", # purple
  "#FF7F00", # orange
  "black", "gold1",
  "skyblue2", "#FB9A99", # lt pink
  "palegreen2",
  "#CAB2D6", # lt purple
  "#FDBF6F", # lt orange
  "gray70", "khaki2",
  "maroon", "orchid1", "deeppink1", "blue1", "steelblue4",
  "darkturquoise", "green1", "yellow4", "yellow3",
  "darkorange4", "brown"
)
#png(file="../images/alfa_diversity.png", width=1500, height=800, res=100)
plot_richness(phyloseq_rare, measures = c("Observed", "Shannon", "Simpson"), x ="rok" , color = "zbiornik")+ geom_boxplot(aes(fill = region), alpha=0.2) +  scale_color_manual(values = c25)+ scale_fill_manual(values = c25) +ggtitle("Alfa-rĂłĆŒnorodnoƛć")+theme(plot.title = element_text(size = 20, face = "bold"), text = element_text(size=20), axis.text.x = element_text(angle=45, hjust=0.5))

#dev.off()
diversity <- estimate_richness(phyloseq_rare, measures = c("Observed", "Shannon", "Simpson"))
data_alphadiv <- cbind(sample_data(phyloseq_rare), diversity)
summary(aov(Observed ~ rok, data = data_alphadiv))
##             Df Sum Sq Mean Sq F value Pr(>F)
## rok          1    425   425.0   1.262  0.273
## Residuals   22   7409   336.8
summary(aov(Observed ~ zbiornik, data = data_alphadiv))
##             Df Sum Sq Mean Sq F value  Pr(>F)    
## zbiornik     3   5147  1715.8   12.77 6.9e-05 ***
## Residuals   20   2687   134.3                    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
summary(aov(Observed ~ region, data = data_alphadiv))
##             Df Sum Sq Mean Sq F value   Pr(>F)    
## region       1   4959    4959   37.96 3.35e-06 ***
## Residuals   22   2875     131                     
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
summary(aov(Shannon ~ rok, data = data_alphadiv))
##             Df Sum Sq Mean Sq F value Pr(>F)
## rok          1  0.251  0.2505   0.487  0.493
## Residuals   22 11.321  0.5146
summary(aov(Shannon ~ zbiornik, data = data_alphadiv))
##             Df Sum Sq Mean Sq F value Pr(>F)  
## zbiornik     3  3.797  1.2655   3.256 0.0432 *
## Residuals   20  7.775  0.3887                 
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
summary(aov(Shannon ~ region, data = data_alphadiv))
##             Df Sum Sq Mean Sq F value Pr(>F)  
## region       1  2.490  2.4904   6.034 0.0224 *
## Residuals   22  9.081  0.4128                 
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
summary(aov(Simpson ~ rok, data = data_alphadiv))
##             Df Sum Sq Mean Sq F value Pr(>F)
## rok          1 0.0132 0.01320   0.438  0.515
## Residuals   22 0.6625 0.03011
summary(aov(Simpson ~ zbiornik, data = data_alphadiv))
##             Df Sum Sq Mean Sq F value Pr(>F)
## zbiornik     3 0.1219 0.04062   1.467  0.254
## Residuals   20 0.5539 0.02769
summary(aov(Simpson ~ region, data = data_alphadiv))
##             Df Sum Sq Mean Sq F value Pr(>F)
## region       1 0.0393 0.03927   1.358  0.256
## Residuals   22 0.6365 0.02893

Beta-rĂłĆŒnorodnoƛć

Macierze odlegƂoƛci

library(reshape2)
## 
## DoƂączanie pakietu: 'reshape2'
## Następujący obiekt zostaƂ zakryty z 'package:tidyr':
## 
##     smiths
plot_dist_as_heatmap <- function(dist, order = sampleOrder, title = NULL) {
  data <- melt(as(dist, "matrix"))
  colnames(data) <- c("x", "y", "distance")
  if (!is.null(order)) {
    data$x <- factor(data$x, levels = order)
    data$y <- factor(data$y, levels = order)
  }
  p <- ggplot(data, aes(x = x, y = y, fill = distance)) + geom_tile() 
  p <- p + theme(axis.title.x = element_blank(), 
                 axis.title.y = element_blank(), 
                 #axis.text.x = element_blank(), 
                 #axis.text.y = element_blank()
  )
  p <- p + scale_fill_continuous(limits = c(0, 1))
  if (!is.null(title)) {
    p <- p + ggtitle(title)
  }
  return(p)
}

sampleOrder = levels(reorder(sample_names(phyloseq_rare), as.numeric(get_variable(phyloseq_rare, "zbiornik"))))
#png(file="../images/bray_dist.png", width=1200, height=800, res=100)
bray_dist = distance(phyloseq_rare, method="bray")
plot_dist_as_heatmap(bray_dist, title = "Bray-Curtis")+theme(plot.title = element_text(size = 20, face = "bold"), text = element_text(size=20), axis.text.x = element_text(angle=45, hjust=1))

#dev.off()
#png(file="../images/unifrac_dist.png", width=1200, height=800, res=100)
unifrac_dist= distance(phyloseq_rare, method="unifrac")
plot_dist_as_heatmap(unifrac_dist, title = "Unifrac")+theme(plot.title = element_text(size = 20, face = "bold"), text = element_text(size=20), axis.text.x = element_text(angle=45, hjust=1))

#dev.off()
#png(file="../images/jaccard_dist.png", width=1200, height=800, res=100)
jaccard_dist = distance(phyloseq_rare, method="jaccard", binary = TRUE)
plot_dist_as_heatmap(jaccard_dist, title = "Jaccard")+theme(plot.title = element_text(size = 20, face = "bold"), text = element_text(size=20), axis.text.x = element_text(angle=45, hjust=1))

#dev.off()

Ordynacje

Najprostsza ordynacja - dendogramy

#png(file="../images/clustering_bray.png", width=1200, height=800, res=100)
clustering_bray = hclust(bray_dist)
plot(clustering_bray)

#dev.off()
#png(file="../images/clustering_unifrac.png", width=1200, height=800, res=100)
clustering_unifrac = hclust(unifrac_dist)
plot(clustering_unifrac)

#dev.off()
#png(file="../images/clustering_jaccard.png", width=1200, height=800, res=100)
clustering_jaccard = hclust(jaccard_dist)
plot(clustering_jaccard)

#dev.off()

NMDS

#png(file="../images/ord_bray.png", width=1200, height=800, res=100)
ord_bray <- ordinate(phyloseq_rare, method = "NMDS", distance = "bray")
## Square root transformation
## Wisconsin double standardization
## Run 0 stress 0.1127009 
## Run 1 stress 0.1299873 
## Run 2 stress 0.1138069 
## Run 3 stress 0.1458413 
## Run 4 stress 0.1138069 
## Run 5 stress 0.1302466 
## Run 6 stress 0.2734871 
## Run 7 stress 0.27806 
## Run 8 stress 0.1138069 
## Run 9 stress 0.1127009 
## ... Procrustes: rmse 1.150362e-05  max resid 2.397073e-05 
## ... Similar to previous best
## Run 10 stress 0.1502923 
## Run 11 stress 0.1127009 
## ... Procrustes: rmse 3.707525e-06  max resid 7.497034e-06 
## ... Similar to previous best
## Run 12 stress 0.1127009 
## ... Procrustes: rmse 8.476871e-07  max resid 1.924919e-06 
## ... Similar to previous best
## Run 13 stress 0.1127009 
## ... Procrustes: rmse 4.468216e-06  max resid 8.348936e-06 
## ... Similar to previous best
## Run 14 stress 0.1127009 
## ... Procrustes: rmse 1.434503e-06  max resid 4.562283e-06 
## ... Similar to previous best
## Run 15 stress 0.1138069 
## Run 16 stress 0.1302466 
## Run 17 stress 0.1138069 
## Run 18 stress 0.1127009 
## ... Procrustes: rmse 7.063661e-06  max resid 1.439686e-05 
## ... Similar to previous best
## Run 19 stress 0.1138069 
## Run 20 stress 0.1299873 
## *** Best solution repeated 6 times
plot_ordination(phyloseq_rare, ord_bray, color="region", label = "rok") + scale_color_manual(values = c25)+ scale_fill_manual(values = c25)+ theme_light()+ geom_point(aes(color = region, shape = pora_roku), alpha = 0.9, size = 3.5)+stat_ellipse(geom = "polygon", aes(fill = zbiornik), alpha = 0.2, linetype =0) +ggtitle("Bray")+theme(plot.title = element_text(size = 20, face = "bold"), text = element_text(size=20))

#dev.off()
#png(file="../images/ord_unifrac.png", width=1200, height=800, res=100)
ord_unifrac <- ordinate(phyloseq_rare, method = "NMDS", distance = "unifrac")
## Run 0 stress 0.1374596 
## Run 1 stress 0.1633944 
## Run 2 stress 0.139172 
## Run 3 stress 0.1386711 
## Run 4 stress 0.1413901 
## Run 5 stress 0.1376384 
## ... Procrustes: rmse 0.005587015  max resid 0.0201143 
## Run 6 stress 0.1374596 
## ... Procrustes: rmse 2.166434e-05  max resid 4.72632e-05 
## ... Similar to previous best
## Run 7 stress 0.15329 
## Run 8 stress 0.1374596 
## ... Procrustes: rmse 1.494077e-05  max resid 4.016958e-05 
## ... Similar to previous best
## Run 9 stress 0.1386711 
## Run 10 stress 0.1563081 
## Run 11 stress 0.1542514 
## Run 12 stress 0.139172 
## Run 13 stress 0.1543685 
## Run 14 stress 0.1386711 
## Run 15 stress 0.1602453 
## Run 16 stress 0.1732924 
## Run 17 stress 0.1457666 
## Run 18 stress 0.169975 
## Run 19 stress 0.1456683 
## Run 20 stress 0.1502568 
## *** Best solution repeated 2 times
plot_ordination(phyloseq_rare, ord_unifrac, color="region", label = "rok") + scale_color_manual(values = c25)+ scale_fill_manual(values = c25)+ theme_light()+ geom_point(aes(color = region, shape = pora_roku), alpha = 0.9, size = 3.5)+stat_ellipse(geom = "polygon", aes(fill = zbiornik), alpha = 0.2, linetype =0) +ggtitle("Unifrac")+theme(plot.title = element_text(size = 20, face = "bold"), text = element_text(size=20))

#dev.off()
#png(file="../images/ord_jaccard.png", width=1200, height=800, res=100)
ord_jaccard <- ordinate(phyloseq_rare, method = "NMDS", distance = "jaccard")
## Square root transformation
## Wisconsin double standardization
## Run 0 stress 0.1127009 
## Run 1 stress 0.1127009 
## ... Procrustes: rmse 5.87677e-06  max resid 2.281841e-05 
## ... Similar to previous best
## Run 2 stress 0.1127009 
## ... Procrustes: rmse 9.619624e-06  max resid 2.157678e-05 
## ... Similar to previous best
## Run 3 stress 0.1127009 
## ... New best solution
## ... Procrustes: rmse 1.891421e-06  max resid 5.318118e-06 
## ... Similar to previous best
## Run 4 stress 0.1138069 
## Run 5 stress 0.3720948 
## Run 6 stress 0.1127009 
## ... Procrustes: rmse 1.719132e-06  max resid 4.989827e-06 
## ... Similar to previous best
## Run 7 stress 0.1127009 
## ... Procrustes: rmse 3.038327e-06  max resid 1.185934e-05 
## ... Similar to previous best
## Run 8 stress 0.1138069 
## Run 9 stress 0.1127009 
## ... Procrustes: rmse 2.570571e-05  max resid 5.756795e-05 
## ... Similar to previous best
## Run 10 stress 0.1299873 
## Run 11 stress 0.1302466 
## Run 12 stress 0.1138069 
## Run 13 stress 0.1302466 
## Run 14 stress 0.1138069 
## Run 15 stress 0.1127009 
## ... Procrustes: rmse 9.471626e-06  max resid 2.326657e-05 
## ... Similar to previous best
## Run 16 stress 0.1302466 
## Run 17 stress 0.1127009 
## ... Procrustes: rmse 4.283744e-06  max resid 8.095503e-06 
## ... Similar to previous best
## Run 18 stress 0.1138069 
## Run 19 stress 0.1444942 
## Run 20 stress 0.1302466 
## *** Best solution repeated 6 times
plot_ordination(phyloseq_rare, ord_jaccard, color="region", label = "rok") + scale_color_manual(values = c25)+ scale_fill_manual(values = c25)+ theme_light()+ geom_point(aes(color = region, shape = pora_roku), alpha = 0.9, size = 3.5)+stat_ellipse(geom = "polygon", aes(fill = zbiornik), alpha = 0.2, linetype =0) +ggtitle("Jaccard")+theme(plot.title = element_text(size = 20, face = "bold"), text = element_text(size=20))

#dev.off()

Anosim

rok = get_variable(phyloseq_rare, "rok")
zb = get_variable(phyloseq_rare, "zbiornik")
region = get_variable(phyloseq_rare, "region")
anosim(bray_dist, grouping = rok)
## 
## Call:
## anosim(x = bray_dist, grouping = rok) 
## Dissimilarity: bray 
## 
## ANOSIM statistic R: -0.01247 
##       Significance: 0.502 
## 
## Permutation: free
## Number of permutations: 999
anosim(bray_dist, grouping = zb)
## 
## Call:
## anosim(x = bray_dist, grouping = zb) 
## Dissimilarity: bray 
## 
## ANOSIM statistic R: 0.8725 
##       Significance: 0.001 
## 
## Permutation: free
## Number of permutations: 999
anosim(bray_dist, grouping = region)
## 
## Call:
## anosim(x = bray_dist, grouping = region) 
## Dissimilarity: bray 
## 
## ANOSIM statistic R: 0.5735 
##       Significance: 0.001 
## 
## Permutation: free
## Number of permutations: 999
anosim(unifrac_dist, grouping = rok)
## 
## Call:
## anosim(x = unifrac_dist, grouping = rok) 
## Dissimilarity: 
## 
## ANOSIM statistic R: 0.02431 
##       Significance: 0.244 
## 
## Permutation: free
## Number of permutations: 999
anosim(unifrac_dist, grouping = zb)
## 
## Call:
## anosim(x = unifrac_dist, grouping = zb) 
## Dissimilarity: 
## 
## ANOSIM statistic R: 0.7807 
##       Significance: 0.001 
## 
## Permutation: free
## Number of permutations: 999
anosim(unifrac_dist, grouping = region)
## 
## Call:
## anosim(x = unifrac_dist, grouping = region) 
## Dissimilarity: 
## 
## ANOSIM statistic R: 0.783 
##       Significance: 0.001 
## 
## Permutation: free
## Number of permutations: 999
anosim(jaccard_dist, grouping = rok)
## 
## Call:
## anosim(x = jaccard_dist, grouping = rok) 
## Dissimilarity: binary jaccard 
## 
## ANOSIM statistic R: 0.02583 
##       Significance: 0.272 
## 
## Permutation: free
## Number of permutations: 999
anosim(jaccard_dist, grouping = zb)
## 
## Call:
## anosim(x = jaccard_dist, grouping = zb) 
## Dissimilarity: binary jaccard 
## 
## ANOSIM statistic R: 0.8589 
##       Significance: 0.001 
## 
## Permutation: free
## Number of permutations: 999
anosim(jaccard_dist, grouping = region)
## 
## Call:
## anosim(x = jaccard_dist, grouping = region) 
## Dissimilarity: binary jaccard 
## 
## ANOSIM statistic R: 0.8443 
##       Significance: 0.001 
## 
## Permutation: free
## Number of permutations: 999

Permanova

metadane <- data.frame(sample_data(phyloseq_rare))
adonis(bray_dist ~ rok, data = metadane)$aov.tab
## 'adonis' will be deprecated: use 'adonis2' instead
adonis(bray_dist ~ zbiornik, data = metadane)$aov.tab
## 'adonis' will be deprecated: use 'adonis2' instead
adonis(bray_dist ~ region, data = metadane)$aov.tab
## 'adonis' will be deprecated: use 'adonis2' instead
adonis(unifrac_dist ~ rok, data = metadane)$aov.tab
## 'adonis' will be deprecated: use 'adonis2' instead
adonis(unifrac_dist ~ zbiornik, data = metadane)$aov.tab
## 'adonis' will be deprecated: use 'adonis2' instead
adonis(unifrac_dist ~ region, data = metadane)$aov.tab
## 'adonis' will be deprecated: use 'adonis2' instead
adonis(jaccard_dist ~ rok, data = metadane)$aov.tab
## 'adonis' will be deprecated: use 'adonis2' instead
adonis(jaccard_dist ~ zbiornik, data = metadane)$aov.tab
## 'adonis' will be deprecated: use 'adonis2' instead
adonis(jaccard_dist ~ region, data = metadane)$aov.tab
## 'adonis' will be deprecated: use 'adonis2' instead

Analiza skƂadu taksonomicznego

#png(file="../images/barplot.png", width=1500, height=800, res=100)
plot_bar(phyloseq_rare, fill = "Genus", x="rok") + geom_bar(stat="identity") + theme_bw() + facet_wrap("zbiornik", scales="free") + scale_fill_manual(values=c25) +ggtitle("Wykres sƂupkowy skƂadu taksonomicznego")+ theme(plot.title = element_text(size = 20, face = "bold"), text = element_text(size=20), axis.text.x = element_text(angle=0, hjust=0.5))

#dev.off()
#png(file="../images/heatmap.png", width=3000, height=4000, res=300)
edited = phyloseq_rare
tax_table(edited)[,"Species"] <- gsub("D_10__", "", tax_table(edited)[,"Species"])
plot_heatmap(edited, taxa.label="Species", low ="blue", high="red", na.value="white",taxa.order="Species")+facet_wrap("zbiornik", scales="free")+ggtitle("Mapa ciepƂa skƂadu taksonomicznego")+ theme(plot.title = element_text(size = 20, face = "bold"), text = element_text(size=20), axis.text.x = element_text(angle=45, hjust=0.5))
## Warning: Transformation introduced infinite values in discrete y-axis

#dev.off()
#png(file="../images/drzewo_kropki.png", width=3000, height=4000, res=300)
plot_tree(phyloseq_rare, ladderize="left", color="zbiornik", shape = "rok", size="abundance", label.tips="Genus", text.size=2, base.spacing=0.04)+ggtitle("Drzewo z kropkami skƂadu taksonomicznego")+ theme(plot.title = element_text(size = 20, face = "bold"), text = element_text(size=20))

#dev.off()